| Name | Version | Summary | Date |
|------|---------|---------|------|
| ai-edge-quantizer-nightly | 0.3.0.dev20250712 | A quantizer for advanced developers to quantize converted AI Edge models. | 2025-07-12 00:13:52 |
| fedcore | 0.0.5.3 | Federated learning core library | 2025-07-09 21:11:50 |
| llmcompressor | 0.6.0 | A library for compressing large language models utilizing the latest techniques and research in the field for both training aware and post training techniques. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation. | 2025-06-24 15:20:55 |
| bitsandbytes | 0.45.5 | k-bit optimizers and matrix multiplication routines. | 2025-04-07 13:32:52 |
| llmcompressor-nightly | 0.4.1.20250314 | A library for compressing large language models utilizing the latest techniques and research in the field for both training aware and post training techniques. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation. | 2025-03-14 03:23:13 |
| vector-quantize-pytorch | 1.22.2 | Vector Quantization - Pytorch | 2025-03-11 21:43:33 |
| kvquant | 0.0.1 | More for Keys, Less for Values: Adaptive KV Cache Quantization 🐍🚀🎉🦕 | 2025-02-27 20:12:37 |
| tensorflores | 0.1.5 | The TensorFlores framework is a Python-based solution designed for optimizing machine learning deployment in resource-constrained environments. | 2025-02-14 11:18:47 |
| gptqmodel | 1.8.1 | A LLM quantization package with user-friendly apis. Based on GPTQ algorithm. | 2025-02-08 20:20:50 |
| nncf | 2.15.0 | Neural Networks Compression Framework | 2025-02-06 10:09:14 |
| autoawq | 0.2.8 | AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. | 2025-01-20 11:03:42 |
| friendli-model-optimizer | 0.10.0 | Model Optimizer CLI for Friendli Engine. | 2025-01-08 09:57:35 |
| fmo-core | 0.9 | Model Optimizer for Friendli Engine. | 2025-01-03 08:40:30 |
| optimum-intel | 1.21.0 | Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality. | 2024-12-06 12:25:14 |
| autoawq-kernels | 0.0.9 | AutoAWQ Kernels implements the AWQ kernels. | 2024-11-16 15:42:59 |
| optimum | 1.23.3 | Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality. | 2024-10-29 17:43:32 |
| optimum-quanto | 0.2.6 | A pytorch quantization backend for optimum. | 2024-10-29 17:13:31 |
| owlite | 0.0.8 | A fake package to warn the user they are not installing the correct package. | 2024-10-02 07:36:45 |
| hquant | 0.1.0 | High quality quantization for Pillow images. | 2024-09-19 23:02:03 |
| bitlinear-pytorch | 0.5.0 | Implementation of the BitLinear layer from: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | 2024-09-11 15:26:31 |